Continuous Decomposition of Granularity for Neural Paraphrase Generation
While Transformers have had significant success in paragraph generation, they
treat sentences as linear sequences of tokens and often neglect their
hierarchical information. Prior work has shown that decomposing the levels of granularity (e.g., word, phrase, or sentence) for input tokens has produced
substantial improvements, suggesting the possibility of enhancing Transformers
via more fine-grained modeling of granularity. In this work, we propose a
continuous decomposition of granularity for neural paraphrase generation
(C-DNPG). In order to efficiently incorporate granularity into sentence
encoding, C-DNPG introduces a granularity-aware attention (GA-Attention)
mechanism that extends multi-head self-attention with: 1) a granularity
head that automatically infers the hierarchical structure of a sentence by
neurally estimating the granularity level of each input token; and 2) two novel
attention masks, namely, granularity resonance and granularity scope, to
efficiently encode granularity into attention. Experiments on two benchmarks, Quora question pairs and Twitter URLs, show that C-DNPG outperforms baseline models by a remarkable margin and achieves
state-of-the-art results on many metrics. Qualitative analysis reveals that C-DNPG indeed captures fine-grained levels of granularity effectively.
Comment: Accepted to be published in COLING 2022
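As an illustration of the mechanism the abstract describes, here is a minimal PyTorch sketch of a granularity-aware attention layer. The scalar granularity head and the particular forms of the resonance and scope terms below are assumptions made for this example; the paper defines its own formulations for both masks.

```python
# Hypothetical sketch of granularity-aware attention; the resonance and
# scope formulas here are illustrative stand-ins, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GranularityAwareAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Granularity head: one scalar in (0, 1) per token; values near
        # 0 / 1 are read here as coarse / fine granularity (an assumption).
        self.granularity = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
                   for z in (q, k, v))
        g = torch.sigmoid(self.granularity(x)).squeeze(-1)           # (b, t)
        # "Resonance": tokens at similar granularity levels attend more.
        resonance = 1.0 - (g.unsqueeze(2) - g.unsqueeze(1)).abs()    # (b, t, t)
        # "Scope": coarser tokens get a wider attention window.
        pos = torch.arange(t, device=x.device).float()
        dist = (pos.unsqueeze(0) - pos.unsqueeze(1)).abs()           # (t, t)
        width = 1.0 + (1.0 - g) * t                                  # (b, t)
        scope = torch.exp(-dist.unsqueeze(0) / width.unsqueeze(-1))  # (b, t, t)
        scores = (q @ k.transpose(-2, -1)) / self.d_head ** 0.5
        scores = scores + (resonance * scope).clamp_min(1e-9).log().unsqueeze(1)
        attn = F.softmax(scores, dim=-1)
        y = (attn @ v).transpose(1, 2).reshape(b, t, d)
        return self.out(y)
```

A call like `GranularityAwareAttention(512, 8)(torch.randn(2, 16, 512))` returns a tensor of the same shape, so the layer can drop into a standard Transformer encoder block.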
Query-Efficient Black-Box Red Teaming via Bayesian Optimization
The deployment of large-scale generative models is often restricted by their
potential risk of causing harm to users in unpredictable ways. We focus on the
problem of black-box red teaming, where a red team generates test cases and
interacts with the victim model to discover a diverse set of failures with
limited query access. Existing red teaming methods construct test cases based
on human supervision or a language model (LM) and query all test cases in a
brute-force manner without incorporating any information from past evaluations,
resulting in a prohibitively large number of queries. To this end, we propose
Bayesian red teaming (BRT), novel query-efficient black-box red teaming methods
based on Bayesian optimization, which iteratively identify diverse positive
test cases leading to model failures by exploiting a pre-defined user input pool and past evaluations. Experimental results on various user input pools demonstrate that our method consistently finds a significantly larger number of diverse positive test cases under a limited query budget than the baseline
methods. The source code is available at
https://github.com/snu-mllab/Bayesian-Red-Teaming.
Comment: ACL 2023 Long Paper - Main Conference
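For intuition, the following is a hedged sketch of pool-based black-box red teaming with Bayesian optimization. The Gaussian-process surrogate over fixed test-case embeddings and the plain UCB acquisition are simplifications assumed for this example; BRT's actual design, including how it encourages diversity among positive test cases, is described in the paper and the linked repository.

```python
# Simplified pool-based Bayesian red teaming loop (illustrative only).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def bayesian_red_team(pool_emb, score_fn, budget=100, threshold=0.5, beta=2.0):
    """pool_emb: (N, d) embeddings of a fixed pool of candidate test cases.
    score_fn(i): queries the victim model on pool item i and returns a
    failure score in [0, 1] (e.g. toxicity of the model's response)."""
    rng = np.random.default_rng(0)
    # Warm start with a few random queries so the surrogate has data to fit.
    queried = [int(i) for i in rng.choice(len(pool_emb), size=5, replace=False)]
    scores = [score_fn(i) for i in queried]
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
    while len(queried) < budget:
        gp.fit(pool_emb[queried], np.asarray(scores))
        mu, sigma = gp.predict(pool_emb, return_std=True)
        ucb = mu + beta * sigma            # upper-confidence-bound acquisition
        ucb[queried] = -np.inf             # never re-query a test case
        nxt = int(np.argmax(ucb))
        queried.append(nxt)
        scores.append(score_fn(nxt))
    # Positive test cases: queries whose failure score exceeds the threshold.
    return [i for i, s in zip(queried, scores) if s > threshold]
```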
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Recent studies have proposed unified user modeling frameworks that leverage
user behavior data from various applications. Many of them benefit from
utilizing users' behavior sequences as plain text, representing rich information from any domain or system without losing generality. Hence, a
question arises: Can language modeling for user history corpus help improve
recommender systems? While the versatility of language modeling has been widely investigated in many domains, its application to recommender systems remains underexplored. We show that language modeling applied directly to task-specific
user histories achieves excellent results on diverse recommendation tasks.
Also, leveraging additional task-agnostic user histories delivers significant
performance benefits. We further demonstrate that our approach can provide
promising transfer learning capabilities for a broad spectrum of real-world
recommender systems, even on unseen domains and services.
Comment: 14 pages, 5 figures, 9 tables
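To make the core idea concrete, here is a hedged sketch of treating a user's behavior sequence as plain text and scoring candidate items by causal-LM likelihood. The model choice, prompt format, and function names are placeholders invented for this example, not the paper's setup, which additionally leverages task-agnostic history corpora.

```python
# Scoring recommendation candidates with an off-the-shelf causal LM over a
# textified user history (placeholder backbone and prompt format).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def score_candidate(history_items, candidate):
    """Mean log-likelihood of `candidate` given a textified user history."""
    # The behavior sequence is rendered as plain text, per the abstract.
    prompt = "User history: " + "; ".join(history_items) + ". Next item:"
    ids = tok(prompt + " " + candidate, return_tensors="pt").input_ids
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = lm(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)  # next-token preds
    target = ids[0, 1:]
    token_ll = logprobs[torch.arange(target.numel()), target]
    return token_ll[n_prompt - 1:].mean().item()          # candidate tokens only

# Rank two hypothetical candidates for a hypothetical history.
history = ["wireless earbuds", "phone case", "usb-c charger"]
ranked = sorted(["power bank", "garden hose"],
                key=lambda c: score_candidate(history, c), reverse=True)
```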